A PARAMETRIC INVESTIGATION OF RIDE QUALITY RATING SCALES 
Thomas K. Dempsey, Glynn D. Coates,* and Jack D. Leatherwood 

ABSTRACT 

An experimental investigation was conducted to determine (1) the relative merits of various category 
scales for the prediction of human discomfort response to vibration and (2) the mathematical 
relationships that allow for transformations of subjective data from any one scale to any other scale. 

A total of 16 category scales were studied and these represented various parametric combinations of 
polarity (i.e., unipolar and bipolar), scale type (continuous or discrete), and number of scalar points 
(3, 5, 7, or 9). Sixteen subject groups (12 subjects per group) were used and each subject group evaluated 
their comfort/discomfort to vertical sinusoidal vibration using one of the rating scales. The experimental 
apparatus utilized was the Langley Research Center's Passenger Ride Quality Apparatus which can expose six 
subjects simultaneously to predetermined vibrations. For this study, the vibration stimuli were composed 
of repeats of selected sinusoidal frequencies (1, 2, 4, 5, 8, 10, 15, and 20 Hz) applied at each of nine 
peak floor acceleration levels (0.05, 0.075, 0.10, 0.125, 0.15, 0.175, 0.20, 0.225, and 0.25 g). 

Results indicated that higher degrees of reliability and discriminability were generally obtained 
from unipolar, continuous type scales containing either seven or nine scalar points as opposed to the 
other scales investigated. Furthermore, transformation of subjective data between category scales was 
found to be possible, with the unipolar scales with the larger numbers of scalar points giving the greatest 
accuracy of transformation. A result of particular interest was that the comfort (or positive) half of a 
bipolar scale was seldom used by subjects to describe their subjective reaction to vibration. 

INTRODUCTION 


The ride quality literature over the past 50 years has emphasized the importance of passenger 
reactions to vibration in the development of comfort criteria for use in vehicle design. A recent review 
(ref. 1) of the criteria literature points out that many differences and contradictions exist in the 
various reported investigations and that one possible contributing factor to these large differences 
could be the use of inappropriate scaling techniques. Invariably, during formal (e.g., refs. 2 and 3) 
or informal ride quality meetings, rating scales are discussed and viewed as a major (if not the greatest) 
cause of this criteria variability. The purpose of this study is to determine the relative merits of 
various rating scales, and to determine the mathematical relationship that will allow transformation of 
subjective data between various scales. 

The large number of rating scales that have been used and discussed can be characterized according 
to (1) the adjectives or adverbs that are used for anchoring scalar points, (2) polarity; whether or not 
a passenger is allowed to evaluate his ride sensation in a unipolar or bipolar fashion, (3) scale type; 
either the category scale is of a line variety and continuous in nature, or consists of category boxes 
of a discrete nature, and (4) the number of scalar points or category demarcations provided on the scale. 
Many discussions among ride quality investigators have centered upon the question of which of these 
scales is the "most appropriate" for use in the development of ride quality criteria. Answers to these 
questions could be determined from experimental tests of (1) reliability, the determination of which 
scale allows subjects to display the greatest repeatability in subjective evaluations; (2) discriminability, 
an assessment of which subjective scale allows the subjects to provide maximum discrimination between 
vibration spectrum characteristics; and (3) flexibility of the scale in allowing transformation of the 
subjective responses to other scales, thereby reducing what is merely an apparent variability between 
comfort criteria. 

The investigation of different adjective anchors is not considered in the present paper since it 
would present an almost endless search for the "most appropriate" subjective scale. Consequently, the 
present study selected as an anchor for all scale variations the adjective "comfort-discomfort" which 
is probably the simplest and most frequently occurring adjective used in this type of study. 

The purpose of the present study is, therefore, to conduct a parametric investigation of scale 
polarity, scale type, and scalar points. These different scales are to be evaluated in terms of the 
previously mentioned factors of scale appropriateness, namely, reliability, discriminability, and the 
ease or flexibility governing transformation of subjective data from one scale to another. 



METHOD 


Simulator 

The apparatus used was the Langley Passenger Ride Quality Apparatus (PRQA). The PRQA is described 
briefly in this section and a detailed description can be obtained from references 4 and 5. The PRQA 
and associated programing and control instrumentation are shown in the photographs of figure 1 on the 
next page. Figure 1(a) shows the waiting room where subjects are instructed as to their participation 
in the experiment, complete questionnaires, etc. Figures 1(b) and 1(c) are photographs of the exterior 
of PRQA, and it should be noted that the actual mechanisms which drive the simulator are located beneath 
the pictured floor. Shown in figure 1(d) is a model of the PRQA indicating the supports, actuators, and 
restraints of the three-axis drive system. The control console is shown in figure 1(e) and is located 
at the same level as the simulator to allow the console control operator to constantly monitor subjects 
within the simulator. An interior view of PRQA fitted with tourist-class aircraft seats is shown in 
figure 1(f). Additional interior views (with front or back panels removed) of PRQA are displayed in 
figures 1(g), 1(h), and 1(i). To reduce the influence of extraneous noises produced by the equipment, 
music was played in the PRQA. In addition, each subject was requested to use ear plugs (see ref. 6). 















Subjects 


A total of 192 subjects participated in the study. The volunteer subjects were obtained from Old 
Dominion University (undergraduate students) and from a contractual subject pool, and were paid for 
their participation in the study. The ages and weights of the subjects are listed in Table I. 

TABLE I.- SUBJECT DEMOGRAPHICS 

                            Age                  Weight 
Subject                                                 Standard 
Sex        Number     Median     Range        Mean      Deviation 

Males        61         21       18-46       165.98       21.88 
Females     131         21       18-55       129.05       24.31 
Total       192         21       18-55       140.78       29.15 



Subjective Evaluation Scales 


A total of 16 different scales were investigated in the present study. These scales were parametric 
combinations of polarity (unipolar or bipolar), scale type (continuous or discrete), and number of 
scalar points (3, 5, 7, or 9 points). The exact scales are displayed below in figures 2(a-p). 



Figure 2.- Category scales investigated. On every scale the point 0 carries the anchor "zero discomfort, 
comfortable, neutral." Unipolar scales run from 0 to "maximum discomfort" at the highest positive value. 
Bipolar scales run from "maximum discomfort" at the most negative value, through 0, to "maximum comfort" 
at the most positive value. Continuous scales are drawn as a marked line; discrete scales are drawn as 
a row of category boxes. 

Panel    Polarity    Scale type    Scalar points    Numeric labels 

(a)      Unipolar    Continuous         3            0 to +2 
(b)      Unipolar    Continuous         5            0 to +4 
(c)      Unipolar    Continuous         7            0 to +6 
(d)      Unipolar    Continuous         9            0 to +8 
(e)      Bipolar     Continuous         3           -1 to +1 
(f)      Bipolar     Continuous         5           -2 to +2 
(g)      Bipolar     Continuous         7           -3 to +3 
(h)      Bipolar     Continuous         9           -4 to +4 
(i)      Unipolar    Discrete           3            0 to +2 
(j)      Unipolar    Discrete           5            0 to +4 
(k)      Unipolar    Discrete           7            0 to +6 
(l)      Unipolar    Discrete           9            0 to +8 
(m)      Bipolar     Discrete           3           -1 to +1 
(n)      Bipolar     Discrete           5           -2 to +2 
(o)      Bipolar     Discrete           7           -3 to +3 
(p)      Bipolar     Discrete           9           -4 to +4 


Subject Instruction 


The subjects were instructed to base evaluations upon the comfort (or discomfort) of a vibration. 
Prior to the start of testing for each session, the subjects were exposed to a vibration (4 Hz, 0.25 
peak g) and told the vibration usually resulted in a rating of maximum discomfort. The subjects were 
purposely not given a vibration typical of maximum comfort since such a vibration is difficult to 
specify and would in fact bias results related to polarity. 



Procedure 


Sixteen groups of subjects (composed of 12 subjects per group) each used one of the previously 
mentioned category scales to evaluate successive "ride segments." A ride segment, as displayed in 
Table II, was a single vertical frequency (1, 2, 4, 5, 8, 10, 15, or 20 Hz) at one of nine peak floor 
acceleration levels (0.05, 0.075, 0.10, 0.125, 0.15, 0.175, 0.20, 0.225, or 0.25 g). The factorial 
combination of these frequencies and acceleration levels resulted in a total of 72 separate ride 
segments, each of which was presented to a subject twice (in order to determine estimates of reliability) 
for a total of 144 ride segments. The eight frequencies were randomized without replacement (twice) 
and were used to define the frequency of vibration of a session. The nine peak floor acceleration levels 
were randomized and determined the nine ride segments of a session. Through the use of a two-way auditory 
communication system, the subjects were instructed when to begin evaluation of a ride by the word "start" 
and when to end the evaluation by the word "stop." The onset and offset of a vibration each lasted 
5 seconds, the duration of the vibration was 10 seconds, and the interstimulus interval was 5 seconds. The 
subjects were further instructed to ignore rise (onset) and decay (offset) vibrations that occurred prior 
and subsequent to the words "start" and "stop," respectively. 
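The randomization described above can be sketched as follows. This is a minimal illustration rather than the procedure actually used to program the simulator: the function name `build_sessions` and the seed are hypothetical, and the sketch assumes that one session corresponds to one frequency with the nine acceleration levels in random order, giving 16 sessions in all.

```python
import random

FREQUENCIES_HZ = (1, 2, 4, 5, 8, 10, 15, 20)
ACCEL_LEVELS_G = (0.05, 0.075, 0.10, 0.125, 0.15, 0.175, 0.20, 0.225, 0.25)


def build_sessions(seed=0):
    """Return a list of sessions; each session is a list of (Hz, g) ride segments.

    Assumption: the eight frequencies are drawn without replacement, and this
    draw is done twice, so every frequency defines exactly two sessions.
    """
    rng = random.Random(seed)
    sessions = []
    for _ in range(2):  # two passes through the eight frequencies
        freq_order = rng.sample(FREQUENCIES_HZ, k=len(FREQUENCIES_HZ))
        for freq in freq_order:
            # Nine acceleration levels in random order form one session.
            levels = rng.sample(ACCEL_LEVELS_G, k=len(ACCEL_LEVELS_G))
            sessions.append([(freq, g) for g in levels])
    return sessions


sessions = build_sessions()
segments = [seg for session in sessions for seg in session]
print(len(sessions), "sessions,", len(segments), "ride segments")
```

Under these assumptions every frequency and acceleration combination occurs exactly twice, which is what allows the test-retest comparison.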

Each session lasted approximately 5 minutes, with a 1 minute rest period after each session. A 
15 minute rest interval was provided after the eighth session instead of the 1 minute interval. 




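TABLE II.- EXPERIMENTAL DESIGN 

[Factorial matrix of frequency (1, 2, 4, 5, 8, 10, 15, and 20 Hz) by peak floor acceleration level 
(0.05 to 0.25 g in steps of 0.025 g).] 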



RESULTS AND DISCUSSION 


This section provides results and discussion in terms of the factors previously described for 
determining scale appropriateness; namely, reliability, discriminability, and flexibility of response 
transformation. Within each of these sections the scale characteristics of polarity, scale type, and 
number of scalar points are addressed. 


Scale Reliability 

The extent to which a category scale allows a subject to repeat evaluations of similar vibrations 
would certainly be an initial requirement of scale appropriateness. Relative to this requirement, 
the reliability of scales varying in polarity, scale type, and scalar points is discussed in successive 
sections. 

Polarity.- Figure 3 on the following page displays the test-retest reliability correlation 
coefficients for unipolar and bipolar scales. These correlations include the paired data for different 
frequencies, acceleration levels, scale type, scalar points, and subjects (N = 6,912 pairs). A z-score 
(z-score transformation test) of 2.882 indicated there was a statistically significant (P<.05) higher 
degree of reliability obtained through the use of unipolar than bipolar scales. 
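A z-score comparison of two independent correlation coefficients can be sketched as below. The sketch assumes that the "z-score transformation test" is the standard Fisher r-to-z test; the function name `fisher_z_test` is hypothetical, and the two correlation values are placeholders, since the actual coefficients appear only in figure 3.

```python
import math


def fisher_z_test(r1, n1, r2, n2):
    """z statistic for the difference between two independent correlations.

    Each r is converted with Fisher's transformation, atanh(r), whose
    sampling variance is approximately 1 / (n - 3).
    """
    z1 = math.atanh(r1)
    z2 = math.atanh(r2)
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return (z1 - z2) / standard_error


# Hypothetical test-retest correlations; N = 6,912 pairs per polarity.
r_unipolar, r_bipolar = 0.80, 0.78
z = fisher_z_test(r_unipolar, 6912, r_bipolar, 6912)
print(f"z = {z:.3f}")
```

With samples this large, even a small difference between two correlations produces a sizeable z statistic.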




Scale type.- Figure 4 displays the test-retest correlation coefficients obtained for discrete and 
continuous type scales. In this case, each correlation was based on paired data for different frequencies, 
acceleration levels, polarity, scalar points, and subjects (N = 6,912 pairs). A z-score of 6.412 
indicated there was a statistical difference (P<.05) between these two correlations. The results indicate 
that a significantly higher degree of reliability will be obtained for continuous rather than discrete 
type scales. 






Scalar points.- Figure 5 displays the test-retest correlation coefficients obtained for 3, 5, 7, or 
9 scalar points. In this case, each correlation was based on paired data for different frequencies, 
acceleration levels, polarity, scale type, and subjects (N = 3,456 pairs). A series of z-score tests 
between these correlation coefficients indicated that there was no difference between 3 and 5 scalar 
points, or between 7 and 9 scalar points. However, there was a statistically higher degree of reliability 
obtained for 7 and 9 scalar points in comparison to 3 or 5 scalar points (z-scores = 0.7469, 5.3527, 
6.2656, 6.0996, 7.0124, and 0.9129 for scalar point comparisons of 3 vs. 5, 3 vs. 7, 3 vs. 9, 5 vs. 7, 
5 vs. 9, and 7 vs. 9, respectively). 

Reliability summary.- The results from this series of analyses indicate that some category scales 
yield a higher degree of reliability than others for the evaluation of vibration. The scales that 
display the greater reliability are unipolar and continuous, with 7 or 9 scalar points. 





Scale Discriminability 


This section addresses the problem of which category scale in terms of polarity, scale type, or 
number of scalar points allows subjects to provide maximum discrimination between ride spectrum 
characteristics. However, there are a variety of mathematical relationships that could exist between 
the category subjective responses and a particular physical measure for the description of 
discrimination accuracy. The four mathematical relationships (psychophysical formulations) typically 
discovered are displayed in Table III, where x is the peak acceleration level and a and b are coefficients 
determined from appropriate least-square fitting techniques. Therefore, the accuracy of discrimination 
associated with variations of polarity, scale type, and number of scalar points will be determined for 
each of the mathematical formulations. 





TABLE III.- PSYCHOPHYSICAL RELATIONSHIPS 

(1) Power           ratings = a x^{b} 

(2) Logarithmic     ratings = a + b \log x 

(3) Exponential     ratings = a \cdot 10^{bx} 

(4) Linear          ratings = a + b x 
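The four relationships of Table III can each be fitted by ordinary least squares after a suitable change of variables, as sketched below. This is an illustration under stated assumptions: the source says only that "appropriate least-square fitting techniques" were used, so fitting the power and exponential forms in logarithmic coordinates is an assumption, the function names `linear_fit` and `fit_all` are hypothetical, and the ratings are made-up values. Note that the correlation reported for the power and exponential forms is computed on the log of the ratings, which requires positive ratings.

```python
import math


def linear_fit(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, r)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return a, b, r


def fit_all(accel, ratings):
    """Fit the four Table III forms; returns {name: (a, b, r)}."""
    log_x = [math.log10(x) for x in accel]
    log_y = [math.log10(y) for y in ratings]  # requires ratings > 0
    a_lin, b_lin, r_lin = linear_fit(accel, ratings)       # a + b x
    a_log, b_log, r_log = linear_fit(log_x, ratings)       # a + b log x
    la_exp, b_exp, r_exp = linear_fit(accel, log_y)        # a 10^(b x)
    la_pow, b_pow, r_pow = linear_fit(log_x, log_y)        # a x^b
    return {
        "linear": (a_lin, b_lin, r_lin),
        "logarithmic": (a_log, b_log, r_log),
        "exponential": (10 ** la_exp, b_exp, r_exp),
        "power": (10 ** la_pow, b_pow, r_pow),
    }


accel = [0.05, 0.075, 0.10, 0.125, 0.15, 0.175, 0.20, 0.225, 0.25]
ratings = [1.0, 1.6, 2.1, 2.9, 3.4, 4.2, 4.8, 5.5, 6.1]  # hypothetical means

for name, (a, b, r) in fit_all(accel, ratings).items():
    print(f"{name:12s} a = {a:8.3f}  b = {b:8.3f}  r = {r:6.3f}")
```

The correlation coefficient returned for each form is the quantity that is compared across scales in the analyses that follow.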



Polarity.- Figure 6(a-d) displays the correlation coefficients between subjective responses and 
vibration measures for both unipolar and bipolar scales, for each of the previously mentioned 
mathematical formulations. Each correlation was based on paired data (subjective responses 
and vibration measures) for different frequencies, acceleration levels, repeats of both frequencies 
and acceleration levels, scale type, scalar points, and subjects (N = 13,824). However, despite 
the fact that the correlations were based on twice the number of data pairs as were certain estimates 
of reliability, the number of pairs used for computation of z-score tests was 144. This number was 
selected so as not to artificially inflate the degrees of freedom. The z-score tests indicated that 
there was no statistical difference between unipolar and bipolar scales for any of the mathematical 
formulations (z-scores = 1.327, 0.957, 1.327, and 1.066 for the linear, logarithmic, exponential, and 
power comparisons of scale polarity, respectively). There is, however, a systematic trend of unipolar 
scales offering a greater accuracy of discrimination between vibration measures than bipolar scales. 
In fact, the z-scores indicate that by chance such differences between correlation coefficients would 
occur only 10 to 15 percent of the time. 
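The chance probability attached to each z-score can be approximated from the standard normal distribution, as in the sketch below. It assumes a one-tailed test, which is consistent with the 1.64 criterion quoted with Table IV; the function name `one_tailed_p` is hypothetical, and the z-scores are those reported above.

```python
import math


def one_tailed_p(z):
    """Upper-tail probability of the standard normal distribution at |z|."""
    return 0.5 * math.erfc(abs(z) / math.sqrt(2.0))


# z-scores reported in the text for the unipolar vs. bipolar comparison.
reported = {
    "linear": 1.327,
    "logarithmic": 0.957,
    "exponential": 1.327,
    "power": 1.066,
}

for name, z in reported.items():
    print(f"{name:12s} z = {z:.3f}  one-tailed p = {one_tailed_p(z):.3f}")
```

None of these probabilities falls below .05, which is why the polarity differences are described as a trend rather than a significant effect.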

Additional z-score tests were computed between the responses of different mathematical descriptions 
of the same type of scale. For example, it was problematical whether or not there was any difference 
between a linear or logarithmic description of the relationship between responses and vibration measure 
for a unipolar scale, etc. There were no statistical differences obtained between any mathematical 
formulations of these relationships for either scale. The implication of these results is that the 
simpler linear relationship can be selected for description of the mathematical relationship. 

Figure 6 




Scale type.- Figures 7(a-d) display the correlation coefficients between subjective responses and 
vibration measures for both continuous and discrete scales for each of the mathematical formulations. 
The number of data pairs for computation of these correlations and restriction of the degrees of freedom 
for computation of z-score tests are identical to those for the polarity analyses. 

There was no statistical difference between the correlations for continuous and discrete type scales 
for any of the mathematical formulations (z-scores = 0.865, 0.957, 0.999, and 1.066 for linear, 
logarithmic, exponential, and power comparisons of continuous and discrete scales, respectively). The 
figures do indicate a trend that continuous type scales allow a greater accuracy of discrimination than 
discrete scales. In addition, the z-scores for the comparisons were of sufficient magnitude to indicate 
that such differences between the scales would occur by chance only 15 to 20 percent of the time. The 
implication is that the evidence (although not conclusive) suggests that a continuous rather than a 
discrete type scale should be used for the investigation of subjective reactions to vibration. 

Similar to the polarity analyses, there were no statistical differences between the various 
psychophysical descriptions. Again, the implication is that selection of the simpler linear relationship 
is appropriate for description of the psychophysical relationship. 

Scalar points.- Figure 8(a-d) shows the correlation coefficients between subjective responses and 
vibration measures for category scales of 3, 5, 7, or 9 scalar points, for each of the mathematical 
formulations. Information and restrictions regarding the number of data pairs are identical to those for 
the polarity and scale type analyses. 

The z-scores obtained from comparison of the discrimination accuracy of these category scales with 
different numbers of scalar points are displayed in Table IV. These results indicate that the nine-point 
scale allows a significantly (P<.05) greater degree of discrimination accuracy than three-point or 
five-point (for some comparisons) scales. Analogous to comparisons between scalar points for reliability, 
these data for discrimination indicate no difference between three- and five-point scales, or between 
seven- and nine-point scales, but a trend of a higher degree of discrimination accuracy for seven- or 
nine-point than for three- or five-point scales. 

Similar to the polarity and scale type analyses, there were no statistical differences between 
mathematical descriptions for any of the category scales varying in number of points. Consequently, 
several types of analyses imply that the linear law can be selected for description of the psychophysical 
relationship. 

TABLE IV.- SUMMARY OF Z-SCORES FROM COMPARISON OF CATEGORY SCALES OF 
DIFFERING NUMBERS OF SCALAR POINTS 

Psychophysical                       Scalar Points Compared 
Relationship      3 vs. 5    3 vs. 7    3 vs. 9    5 vs. 7    5 vs. 9    7 vs. 9 

Linear              .092     -1.511     -1.763*    -1.419     -1.671      -.252 
Logarithmic         .000     -1.545     -1.688*    -1.545     -1.688*     -.143 
Exponential         .185     -1.511     -1.763*    -1.327     -1.579      -.252 
Power               .109     -1.545     -1.545     -1.436     -1.436      -.000 

*P<.05; z-score value >1.64 or <-1.64 needed to achieve statistical significance. 


Discriminability summary.- Due to restrictions associated with degrees of freedom, the discriminability 
analyses were not as conclusive as those for reliability. There were, however, strong trends for 
discriminability essentially in agreement with those for reliability. Specifically, the category scales 
that display trends of greater discriminability are of a unipolar continuous nature, with either seven 
or nine scalar points. 



Scale Transformation 


The flexibility of a category scale in allowing transformation of the subjective responses to other 
scales is addressed in this section. Figure 9 shows typical transformation data. The figure displays 
a cross plot of responses from two different category scales, the responses of which were produced 
by the same vibration (e.g., frequency by acceleration level). The cross plotted data represent the 
mean response of 12 different subjects for each of the scales. The correlation coefficient between 
the responses of the two scales was -.98, and the standard error of estimate (standard deviation about 
the regression line) was 0.325. This latter value could be considered to represent the accuracy of a 
particular scale in predicting responses of other scales. Table V displays the mean standard error 
of estimate obtained for a particular scale when used to predict responses of the other scales 
investigated. The criterion (predicted) scores were adjusted to a nine-point scale to allow direct 
comparison between standard errors of estimate. The standard errors of estimate were used in 
Table V to provide a rank ordering of the category scales in terms of prediction accuracy. These data, 
as well as similar transformation data, indicate: (1) transformation of subjective data between category 
scales is possible, (2) generally unipolar scales of a higher number of scalar points (seven or nine) 
allow the greatest accuracy of transformation, and (3) the comfort or positive end of a bipolar scale 
is not used very often by subjects to describe their sensations. 
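A transformation between two scales, together with its standard error of estimate, can be sketched as a simple linear regression on the mean responses. The sketch is illustrative only: the function names `regress` and `to_nine_point` are hypothetical, the mean ratings are invented, dividing the residual sum of squares by n - 2 is an assumption about how the "standard deviation about the regression line" was computed, and the linear rescaling to the nine-point range is an assumption about how the criterion scores were adjusted.

```python
import math


def regress(xs, ys):
    """Least-squares line y = a + b*x; returns (a, b, r, see).

    see is the standard error of estimate, here sqrt(SS_residual / (n - 2)).
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    r = sxy / math.sqrt(sxx * syy)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return a, b, r, math.sqrt(ss_res / (n - 2))


def to_nine_point(value, n_points):
    """Stretch a rating from an n_points scale onto the nine-point range.

    Assumption: a linear stretch about 0, so the end points of the scale
    map onto the end points of the nine-point scale of the same polarity.
    """
    return value * 8.0 / (n_points - 1)


# Hypothetical mean ratings (12 subjects each) for the same six vibrations.
unipolar_9 = [1.2, 2.5, 3.9, 5.1, 6.4, 7.3]          # predictor scale
bipolar_5 = [-0.2, -0.6, -0.9, -1.3, -1.5, -1.9]     # criterion scale

criterion = [to_nine_point(v, 5) for v in bipolar_5]
a, b, r, see = regress(unipolar_9, criterion)
print(f"criterion = {a:.3f} + {b:.3f} * predictor, r = {r:.3f}, SEE = {see:.3f}")
```

The slope is negative in this example because discomfort is scored positively on a unipolar scale and negatively on a bipolar scale, which also accounts for the sign of a correlation such as the -.98 quoted above.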

TABLE V.- A SUMMARY OF THE CATEGORY SCALES RANKED FROM LOWEST TO HIGHEST 
IN TERMS OF MEAN STANDARD ERROR OF ESTIMATES 

Rank     Scale                              Standard Error of Estimate 

  1      Unipolar-Continuous   9 points              .425 
  2      Unipolar-Discrete     7 points              .429 
  3      Bipolar-Interval      7 points              .451 
  4      Unipolar-Discrete     9 points              .466 
  5      Unipolar-Discrete     5 points              .474 
  6      Unipolar-Continuous   7 points              .474 
  7      Unipolar-Continuous   3 points              .489 
  8      Bipolar-Discrete      7 points              .491 
  9      Unipolar-Discrete     3 points              .498 
 10      Bipolar-Discrete      9 points              .506 
 11      Unipolar-Continuous   5 points              .509 
 12      Bipolar-Discrete      5 points              .522 
 13      Bipolar-Continuous    9 points              .556 
 14      Bipolar-Continuous    3 points              .598 
 15      Bipolar-Continuous    5 points              .662 
 16      Bipolar-Discrete      3 points              .687 



CONCLUDING REMARKS 


Several major conclusions regarding category scales that can be derived from this study are: 
(1) higher degrees of reliability and discriminability are generally obtained for unipolar continuous 
type scales of either seven or nine scalar points than for other scales, (2) transformation of 
subjective data between category scales is possible, (3) generally unipolar scales of a higher number 
of scalar points allow the greatest accuracy of transformation to other scales, and (4) the comfort 
or positive end of a bipolar scale is not used extensively by subjects for description of their 
sensations to vibration. 